Abstract. Betweenness is a well-known centrality measure that ranks
the nodes of a network according to their participation in shortest paths.
Since an exact computation is prohibitive in large networks, several
approximation algorithms have been proposed. Besides that, recent years
have seen the publication of dynamic algorithms for efficient recomputation
of betweenness in evolving networks. In previous work we proposed
the first semi-dynamic algorithms that recompute an approximation of
betweenness in connected graphs after batches of edge insertions.

In this paper we propose the first fully-dynamic approximation algorithms
(for weighted and unweighted undirected graphs that need not be
connected) with a provable guarantee on the maximum approximation
error. The transfer to fully-dynamic and disconnected graphs implies
additional algorithmic problems that could be of independent interest.
In particular, we propose a new upper bound on the vertex diameter for
weighted undirected graphs. For both weighted and unweighted graphs,
we also propose the first fully-dynamic algorithms that keep track of
this upper bound. In addition, we extend our former algorithm for
semi-dynamic BFS to batches of both edge insertions and deletions.

Using approximation, our algorithms are the first to make in-memory
computation of betweenness in fully-dynamic networks with millions of
edges feasible. Our experiments show that they can achieve substantial
speedups compared to recomputation, up to several orders of magnitude.

Keywords: betweenness centrality, algorithmic network analysis,
fully-dynamic graph algorithms, approximation algorithms, shortest paths


1 Introduction 

The identification of the most central nodes of a network is a fundamental
problem in network analysis. Betweenness centrality (BC) is a well-known
index that ranks the importance of nodes according to their participation in
shortest paths. Intuitively, a node has high BC when it lies on many shortest
paths between pairs of other nodes. Formally, the BC of a node $v$ is defined as

$$c_B(v) = \frac{1}{n(n-1)} \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}},$$

where $n$ is the number of nodes, $\sigma_{st}$ is the number of shortest
paths between two nodes $s$ and $t$, and $\sigma_{st}(v)$ is the number of these
paths that go through node $v$. Since it depends on all shortest paths, the exact
computation of BC is expensive: the best known algorithm [4] is quadratic in the
number of nodes for sparse networks and cubic for dense networks, which is
prohibitive for networks with hundreds of thousands of nodes. Many graphs of
interest, however, such as web graphs or social networks, have millions or even
billions of nodes and edges. For this reason, approximation algorithms [?] must
be used in practice. In addition, many large graphs of interest evolve
continuously, making the efficient recomputation of BC a necessity. In a previous
work, we proposed the first two approximation algorithms [3] (IA for unweighted
and IAW for weighted graphs) that can efficiently recompute the approximate BC
scores after batches of edge insertions or weight decreases. IA and IAW are the
only semi-dynamic algorithms that can actually be applied to large networks. The
algorithms build on RK [18], a static algorithm with a theoretical guarantee on
the quality of the approximation, and inherit this guarantee from RK. However,
IA and IAW target a relatively restricted configuration: only connected graphs
and edge insertions/weight decreases.
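
To make the definition of BC concrete, the following minimal Python sketch evaluates it literally on a small unweighted graph. It is an illustration only, not the algorithm of [4]: it assumes an adjacency-list dictionary `adj`, at least two nodes, and the helper names `bfs_counts` and `exact_betweenness` are hypothetical. It uses the fact that $v$ lies on a shortest $s$-$t$ path exactly when $d(s,v) + d(v,t) = d(s,t)$, in which case $\sigma_{st}(v) = \sigma_{sv} \cdot \sigma_{vt}$.

```python
from collections import deque


def bfs_counts(adj, s):
    """BFS from s: distances and numbers of shortest paths (sigma)."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # w discovered for the first time
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def exact_betweenness(adj):
    """Normalized BC by direct evaluation of the definition (small graphs only)."""
    n = len(adj)
    info = {s: bfs_counts(adj, s) for s in adj}
    bc = {v: 0.0 for v in adj}
    for s in adj:
        dist_s, sigma_s = info[s]
        for t in dist_s:                 # only targets reachable from s
            if t == s:
                continue
            for v in adj:
                if v == s or v == t or v not in dist_s:
                    continue
                dist_v, sigma_v = info[v]
                if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    bc[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return {v: score / (n * (n - 1)) for v, score in bc.items()}
```

On the path graph with three nodes, the middle node lies on the only shortest path of the two ordered pairs of endpoints, so its score is 2/(3·2) = 1/3, while the endpoints score 0.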

Our contributions. In this paper we present the first fully-dynamic algorithms
(handling edge insertions, deletions and arbitrary weight updates) for BC
approximation in weighted and unweighted undirected graphs. Our algorithms
extend the semi-dynamic ones we presented in [3], while keeping the theoretical
guarantee on the maximum approximation error. The transfer to fully-dynamic
and disconnected graphs implies several additional problems compared to the
restricted case we considered previously [3]. Consequently, we present the
following intermediate results, all of which could be of independent interest.
(i) We propose a new upper bound on the vertex diameter VD (i.e. the number of
nodes in the shortest path(s) with the maximum number of nodes) for weighted
undirected graphs. This bound can significantly improve on the one used in the
RK algorithm [18] if the network's weights vary in relatively small ranges (from
the size of the largest connected component to at most twice the vertex diameter
times the ratio between the maximum and the minimum edge weights). (ii) For
both weighted and unweighted graphs, we present the first fully-dynamic
algorithm for updating an approximation of VD, which is equivalent to the
diameter in unweighted graphs. (iii) We extend our previous semi-dynamic BFS
algorithm [3] to batches of both edge insertions and deletions. In our
experiments, we compare our algorithms to recomputation with RK on both
synthetic and real dynamic networks. Our results show that our algorithms can
achieve substantial speedups, often several orders of magnitude on single-edge
updates, and are always faster than recomputation on batches of more than 1000
edges.

2 Related work 

2.1 Overview of algorithms for computing BC 

The best static exact algorithm for BC (BA) is due to Brandes [4] and requires
$O(nm)$ operations for unweighted graphs and $O(nm + n^2 \log n)$ for graphs
with positive edge weights. The algorithm computes a single-source shortest path
(SSSP) search from every node $s$ in the graph and adds to the BC score of
each node $v \neq s$ the fraction of shortest paths that go through $v$. Several
static approximation algorithms have been proposed that compute an SSSP search
from a set of randomly chosen nodes and extrapolate the BC scores of the other
nodes [?]. The static approximation algorithm by Riondato and Kornaropoulos
(RK) [18] samples a set of shortest paths and adds a contribution to each
node in the sampled paths. This approach allows a theoretical guarantee on the
quality of the approximation and will be described in Section 2.2. Recent years
have seen the publication of a few dynamic exact algorithms [14, 10, 12, 11, 16, 19].
Most of them store the previously calculated BC values and additional
information, like the distance of each node from every source, and try to limit
the recomputation to the nodes whose BC has actually been affected. All the
dynamic algorithms perform better than recomputation on certain inputs. Yet,
none of them is in general better than BA. In fact, they all require updating
an all-pairs shortest paths (APSP) search, for which no algorithm has an
improved worst-case complexity compared to the best static algorithm [?]. Also,
the scalability of the dynamic exact BC algorithms is strongly compromised by
their memory requirement of $\Omega(n^2)$. To overcome these problems, we
presented two algorithms that efficiently recompute an approximation of the BC
scores instead of their exact values [3]. The algorithms have shown high
speedups compared to recomputation with RK and a good scalability, but they
are limited to connected graphs and batches of edge insertions/weight decreases
(see Section 2.3).


2.2 RK algorithm 

The static approximation algorithm RK [18] is the foundation for the incremental
approach we presented in [3] and our new fully-dynamic approach. RK samples
a set $S = \{p_{(1)}, \ldots, p_{(r)}\}$ of $r$ shortest paths between
randomly-chosen source-target pairs $(s,t)$. Then, RK computes the approximated
betweenness $\tilde{c}_B(v)$ of a node $v$ as the fraction of sampled paths
$p_{(k)} \in S$ that go through $v$, by adding $\frac{1}{r}$ to $v$'s score for
each of these paths. In each of the $r$ iterations, the probability of a shortest
path $p_{st}$ to be sampled is
$\pi_G(p_{st}) = \frac{1}{n(n-1)} \cdot \frac{1}{\sigma_{st}}$. The number $r$ of
samples required to approximate the BC scores with the given error guarantee is

$$r = \frac{c}{\epsilon^2} \left( \lfloor \log_2 (VD - 2) \rfloor + 1 + \ln \frac{1}{\delta} \right),$$

where $\epsilon$ and $\delta$ are constants in $(0,1)$ and $c \approx 0.5$. Then,
if $r$ shortest paths are sampled according to $\pi_G$, with probability at least
$1 - \delta$ the approximations $\tilde{c}_B(v)$ are within $\epsilon$ from their
exact value:
$\Pr(\exists v \in V \text{ s.t. } |\tilde{c}_B(v) - c_B(v)| > \epsilon) < \delta$.
To sample the shortest paths according to $\pi_G$, RK first chooses a
source-target node pair $(s,t)$ uniformly at random and performs a shortest-path
search (Dijkstra or BFS) from $s$ to $t$, keeping also track of the number
$\sigma_{sv}$ of shortest paths between $s$ and $v$ and of the list of
predecessors $P_s(v)$ (i.e. the nodes that immediately precede $v$ in the
shortest paths between $s$ and $v$) for any node $v$ between $s$ and $t$. Then one
shortest path is selected: starting from $t$, a predecessor $z \in P_s(t)$ is
selected with probability
$\sigma_{sz} / \sum_{w \in P_s(t)} \sigma_{sw} = \sigma_{sz} / \sigma_{st}$.
The sampling is repeated iteratively until node $s$ is reached.
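
The sampling procedure described above can be sketched in Python for unweighted graphs as follows. This is a hedged illustration rather than the RK implementation: the function names, the adjacency-list dictionary `adj`, rounding $r$ up to an integer, and running the BFS over the whole component instead of stopping at $t$ are assumptions of the sketch.

```python
import math
import random
from collections import deque


def sample_size(vd, eps, delta, c=0.5):
    """Number of samples r from the formula above (requires vd > 2).

    Rounding up to an integer is an assumption of this sketch.
    """
    bound = math.floor(math.log2(vd - 2)) + 1 + math.log(1 / delta)
    return math.ceil((c / eps ** 2) * bound)


def sample_shortest_path(adj, s, t, rng=random):
    """Sample one shortest s-t path uniformly among all shortest s-t paths."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    if t not in dist:
        return None                      # t is not reachable from s
    path = [t]
    while path[-1] != s:
        v = path[-1]
        # Predecessors of v: neighbors one level closer to s.
        preds = [z for z in adj[v] if dist.get(z) == dist[v] - 1]
        weights = [sigma[z] for z in preds]   # probability sigma_sz / sigma_sv
        path.append(rng.choices(preds, weights=weights)[0])
    path.reverse()
    return path
```

Choosing each predecessor with probability proportional to its path count makes every shortest path between $s$ and $t$ equally likely, which matches the factor $1/\sigma_{st}$ in $\pi_G$.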

Approximating the vertex diameter. RK uses two upper bounds on VD that can
both be computed in $O(n + m)$. For unweighted undirected graphs, it samples
a source node $s_i$ for each connected component of $G$, computes a BFS from
each $s_i$ and sums the two shortest paths with maximum length starting in $s_i$.
The VD approximation is the maximum of these sums over all components. For
weighted graphs, RK approximates VD with the size of the largest connected
component, which can be a significant overestimation for complex networks,
possibly of orders of magnitude. In this paper, we present a new approximation
for weighted graphs, described in Section 3.


2.3 IA and IAW algorithms

IA and IAW are the incremental approximation algorithms (for unweighted and
weighted graphs, respectively) that we presented previously [3]. The algorithms
are based on the observation that if only edge insertions are allowed and the
graph is connected, VD cannot increase, and therefore neither can the number $r$
of samples required by RK for the theoretical guarantee. Instead of recomputing
$r$ new shortest paths after a batch of edge insertions, IA and IAW replace each
old shortest path $p_{st}$ with a new shortest path between the same node pair
$(s,t)$. In IAW the paths are recomputed with a slightly-modified T-SWSF [2],
whereas IA uses a new semi-dynamic BFS algorithm. The BC scores are updated by
subtracting $1/r$ from the BC of the nodes in the old path and adding $1/r$ to
the BC of nodes in the new shortest path.
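
The score update for one replaced sample can be sketched in a few lines of Python. The function name `replace_path` and the dictionary `scores` are hypothetical, and the sketch assumes that only the inner nodes of a path contribute to the scores, in line with the condition $s \neq v \neq t$ in the definition of BC.

```python
def replace_path(scores, old_path, new_path, r):
    """Swap one sampled path for another and adjust the approximate BC scores.

    scores maps each node to its current approximation; r is the number of
    samples. Endpoints are skipped (assumption: only inner nodes are credited).
    """
    for v in old_path[1:-1]:
        scores[v] -= 1.0 / r
    for v in new_path[1:-1]:
        scores[v] = scores.get(v, 0.0) + 1.0 / r
    return scores
```

Since the number of samples stays the same, no renormalization is needed: each sample keeps contributing exactly $1/r$.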


2.4 Batch dynamic SSSP algorithms 

Dynamic SSSP algorithms recompute distances from a source node after a single
edge update or a batch of edge updates. Algorithms for the batch problem have
been published [?] and compared in experimental studies [?]. The experiments
show that the tuned algorithm T-SWSF presented in [2] performs well
on many types of graphs and edge updates. For batches of only edge insertions
in unweighted graphs, we developed an algorithm asymptotically faster than
T-SWSF [3]. The algorithm is in principle similar to T-SWSF, but has an improved
complexity thanks to different data structures.


3 New VD approximation for weighted graphs 

Let $G$ be an undirected graph. For simplicity, let $G$ be connected for now. If
it is not, we compute an approximation for each connected component and take
the maximum over all the approximations. Let $T \subseteq G$ be an SSSP tree from
any source node $s \in V$. Let $p_{xy}$ denote a shortest path between $x$ and $y$
in $G$ and let $p^T_{xy}$ denote a shortest path between $x$ and $y$ in $T$. Let
$|p_{xy}|$ be the number of nodes in $p_{xy}$ and $d(x,y)$ be the distance between
$x$ and $y$ in $G$, and analogously for $|p^T_{xy}|$ and $d^T(x,y)$. Let
$\overline{\omega}$ and $\underline{\omega}$ be the maximum and minimum
edge weights, respectively. Let $u$ and $v$ be the nodes with maximum distance
from $s$, i.e. $d(s,u) \geq d(s,v) \geq d(s,x) \ \forall x \in V, x \neq u$.

We define the VD approximation
$\widetilde{VD} := 1 + \frac{d(s,u) + d(s,v)}{\underline{\omega}}$.

Proposition 1. $VD \leq \widetilde{VD} < 2 \cdot \frac{\overline{\omega}}{\underline{\omega}} \cdot VD$.
(Proof in the Appendix.)

To obtain the upper bound $\widetilde{VD}$, we can simply compute an SSSP search
from any node $s$, find the two nodes with maximum distance and perform the
remaining calculations. Notice that $\widetilde{VD}$ extends the upper bound
proposed for RK [18] for unweighted graphs: when the graph is unweighted and thus
$\underline{\omega} = \overline{\omega}$, $\widetilde{VD}$ becomes equal to the
approximation used by RK. Complex networks are often characterized by a small
diameter, and in networks like coauthorship, friendship and communication
networks, VD and $\overline{\omega}/\underline{\omega}$ can be several orders of
magnitude smaller than the size of the largest component. This translates into a
substantially improved VD approximation.
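
As an illustration of the bound, the following Python sketch computes $\widetilde{VD}$ with a textbook Dijkstra search. It assumes a connected graph with positive weights stored as `adj[u] = [(v, weight), ...]` and node labels that can be compared with each other (they break ties in the heap); the function names are hypothetical and not taken from the paper's implementation.

```python
import heapq


def dijkstra(adj, s):
    """Distances from s in a graph with positive edge weights."""
    dist = {s: 0.0}
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                     # stale heap entry
        for w, weight in adj[u]:
            nd = d + weight
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist


def vd_upper_bound(adj, s):
    """1 + (d(s,u) + d(s,v)) / min_weight, with u, v the two farthest nodes."""
    dist = dijkstra(adj, s)
    weights = [weight for nbrs in adj.values() for _, weight in nbrs]
    far = sorted((d for v, d in dist.items() if v != s), reverse=True)[:2]
    if not weights or not far:
        return 1                         # a single node: the path is s itself
    return 1 + sum(far) / min(weights)
```

For the weighted path 0-1-2-3 with weights 1, 2, 1 and source 1, the two largest distances are 3 and 2, so the bound is 1 + 5/1 = 6. The true vertex diameter is 4, and the value stays below $2 \cdot (2/1) \cdot 4 = 16$, as Proposition 1 requires.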

4 New fully-dynamic algorithms 

Overview. We propose two fully-dynamic algorithms, one for unweighted (DA,
dynamic approximation) and one for weighted (DAW, dynamic approximation
weighted) graphs. Similarly to IA and IAW, our new fully-dynamic algorithms
keep track of the old shortest paths and substitute them only when necessary.
However, if $G$ is not connected or edge deletions occur, VD can grow and a
simple substitution of the paths is not sufficient anymore. Although many
real-world networks exhibit a shrinking-diameter behavior, to ensure our
theoretical guarantee we need to keep track of VD over time and sample new paths
in case VD increases. The need for an efficient update of VD significantly
increases the difficulty of the fully-dynamic problem, as does the necessity to
recompute the SSSPs after batches of both edge insertions and deletions. The
BC update has two building blocks: a fully-dynamic algorithm
that updates distances and number of shortest paths from a certain source node
(SSSP update) and an algorithm that keeps track of a VD approximation for
each connected component of $G$. The following paragraphs give an overview of
these building blocks, which could be of independent interest. The last
paragraph outlines the dynamic BC approximation algorithm. Due to space
constraints, a detailed description of the algorithms as well as the pseudocodes
and the omitted proofs can be found in the Appendix.

SSSP update in weighted graphs. Our SSSP update is based on T-SWSF [2],
which recomputes distances from a source node $s$ after a batch $\beta$ of weight
updates (or edge insertions/deletions). For our BC algorithm, we need two
extensions of T-SWSF: an algorithm that also recomputes the number of shortest
paths between $s$ and the other nodes (updateSSSP-W) and one that also updates
a VD approximation for the connected component of $s$ (updateApprVD-W). The
VD approximation is computed as described in Section 3. Thus, updateApprVD-W
keeps track of the two maximum distances $d'$ and $d''$ from $s$ and the minimum
edge weight $\underline{\omega}$. We call affected nodes the nodes whose distance
(or also whose number of shortest paths, in updateSSSP-W) from $s$ has changed as
a consequence of $\beta$. Basically, the idea is to put the set $A$ of affected
nodes $w$ into a priority queue $Q$ with priority $p(w)$ equal to the candidate
distance of $w$. When $w$ is extracted, if there is actually a path of length
$p(w)$ from $s$ to $w$, the new distance of $w$ is set to $p(w)$; otherwise $w$ is
reinserted into $Q$ with a higher candidate distance. In both cases, the affected
neighbors of $w$ are inserted into $Q$. In updateApprVD-W, $d'$ and $d''$ are
recomputed while updating the distances and $\underline{\omega}$ is updated while
scanning $\beta$. In updateSSSP-W, the number $\sigma_{sw}$ of shortest paths of
$w$ is recomputed as the sum of the $\sigma_{sz}$ of the new predecessors $z$ of
$w$.

Let $|\beta|$ represent the cardinality of $\beta$ and let $\|A\|$ represent the
sum of the nodes in $A$ and of the edges that have at least one endpoint in $A$.
Then, the following complexity derives from feeding $Q$ with the batch and
inserting into/extracting from $Q$ the affected nodes and their neighbors.

Lemma 1. The time required by updateApprVD-W (updateSSSP-W) to update
the distances and $\widetilde{VD}$ (the number of shortest paths) is
$O(|\beta| \log |\beta| + \|A\| \log \|A\|)$.
SSSP update in unweighted graphs. For unweighted graphs, we basically
replace the priority queue $Q$ of updateApprVD-W and updateSSSP-W with a list of
queues, like the one we used in [3] for the incremental BFS. Each queue represents
a level from 0 (which only the source belongs to) to the maximum distance $d'$.
The levels replace the priorities and also in this case represent the candidate
distances for the nodes. In order not to visit a node multiple times, we use
colors to distinguish the unvisited nodes from the visited ones. The replacement
of the priority queue with the list of queues decreases the complexity of the
SSSP update algorithms for unweighted graphs, which we call updateApprVD-U
and updateSSSP-U, in analogy with the ones for weighted graphs.

Lemma 2. The time required by updateApprVD-U (updateSSSP-U) to update
the distances and $\widetilde{VD}$ (the number of shortest paths) is
$O(|\beta| + \|A\| + d_{max})$, where $d_{max}$ is the maximum distance from $s$
reached during the update.

Fully-dynamic VD approximation. The algorithm keeps track of a VD
approximation for the whole graph $G$, i.e. for each connected component of $G$.
It is composed of two phases. In the initialization, we compute an SSSP from a
source node $s_i$ for each connected component $C_i$. During the SSSP search from
$s_i$, we also compute a VD approximation $\widetilde{VD}_i$ for $C_i$, as
described in Sections 2.2 and 3. In the update, we recompute the SSSPs and the VD
approximations with updateApprVD-W (or updateApprVD-U). Since components might
split or merge, we might need to compute new approximations, in addition to
updating the old ones. To do this, for each node, we keep track of the number of
times it has been visited. This way we discard source nodes that have already
been visited and compute a new approximation for components that have become
unvisited. The complexity of the update of the VD approximation derives from the
VD update in the single components, using updateApprVD-W and updateApprVD-U.

Theorem 1. The time required to update the VD approximation is
$O(n_c \cdot |\beta| \log |\beta| + \sum_{i=1}^{n_c} \|A^{(i)}\| \log \|A^{(i)}\|)$
in weighted graphs and
$O(n_c \cdot |\beta| + \sum_{i=1}^{n_c} \|A^{(i)}\| + d^{(i)}_{max})$
in unweighted graphs, where $n_c$ is the number of components in $G$ before the
update and $\|A^{(i)}\|$ is the sum of affected nodes in $C_i$ and their incident
edges.


Dynamic BC approximation. Let $G$ be an undirected graph with $n_c$ connected
components. Now that we have defined our building blocks, we can outline a
fully-dynamic BC algorithm: we use the fully-dynamic VD approximation to
recompute VD after a batch, we update the $r$ sampled paths with updateSSSP
and, if VD (and therefore $r$) increases, we sample new paths. However, since
updateSSSP and updateApprVD share most of the operations, we can "merge"
them and update at the same time the shortest paths from a source node $s$ and
the VD approximation for the component of $s$. We call this hybrid function
updateSSSPVD. Instead of storing and updating $n_c$ SSSPs for the VD
approximation and $r$ SSSPs for the BC scores, we recompute a VD approximation
for each of the $r$ samples while recomputing the shortest paths with
updateSSSPVD. This way we do not need to compute an additional SSSP for the
components covered by the $r$ sampled paths (i.e. in which the paths lie), saving
time and memory. Only for components that are not covered by any of them (if
they exist), we compute and store a separate VD approximation. We refer to such
components as $R'$ (and to $|R'|$ as $r'$).

The high-level description of the update after a batch $\beta$ is shown as
Algorithm 1. After changing the graph according to $\beta$ (Line 1), we recompute
the previous $r$ samples and the VD approximations for their components
(Lines 2-5). Then, similarly to IA and IAW, we update the BC scores of the nodes
in the old and in the new shortest paths. Next, we update a VD approximation
for the components in $R'$ (Lines 6-8) and compute a new approximation for
new components that have formed by applying the batch (Lines 9-12). Then, we
use the results to update the number of samples (Lines 13-14). If necessary,
we sample additional paths and normalize the BC scores (Lines 15-21). The
difference between DA and DAW is the way the SSSPs and the VD approximation
are updated: in DA we use updateApprVD-U and in DAW updateApprVD-W.
Differently from RK and our previous algorithms IA and IAW, in DA and DAW
we scan the neighbors every time we need the predecessors instead of storing
them. This allows us to use $O(n)$ memory per sample (i.e., $O((r + r')n)$ in
total) instead of $O(m)$ per sample, while our experiments show that the running
time is hardly influenced. The number of samples depends on $\epsilon$, so in
theory this can be as large as $|V|$. However, the experiments conducted in [3]
show that relatively large values of $\epsilon$ (e.g. $\epsilon = 0.05$) lead to
a good ranking of nodes with high BC, and for such values the number of samples
is typically much smaller than $|V|$, making the memory requirements of our
algorithms significantly less demanding than those of the dynamic exact
algorithms ($\Omega(n^2)$) for many applications.
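
The idea of recovering predecessors by scanning neighbors, instead of storing them, can be sketched as follows. The Python function below is an illustration under assumptions, not the NetworKit code: `adj[v]` is taken to be a list of `(neighbor, weight)` pairs, `dist` holds the stored distances from the source of one sample, and a tolerance is used because floating-point path lengths may differ by rounding.

```python
import math


def predecessors(adj, dist, v, rel_tol=1e-9):
    """Nodes z adjacent to v with dist[z] + weight(z, v) == dist[v].

    Only the distance array (O(n) per sample) has to be stored; the
    predecessor lists are rebuilt on demand from the adjacency lists.
    """
    return [
        z
        for z, weight in adj[v]
        if z in dist and math.isclose(dist[z] + weight, dist[v], rel_tol=rel_tol)
    ]
```

Each call costs time proportional to the degree of $v$, which is the price paid for dropping the $O(m)$ predecessor storage per sample.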


Theorem 2. Algorithm 1 preserves the guarantee on the maximum absolute
error, i.e. naming $c'_B(v)$ and $\tilde{c}'_B(v)$ the new exact and approximated
BC values, respectively,
$\Pr(\exists v \in V \text{ s.t. } |c'_B(v) - \tilde{c}'_B(v)| > \epsilon) < \delta$.

Algorithm 1: BC update after a batch β of edge updates

 1  applyBatch(G, β);
 2  for i ← 1 to r do
 3      ṼD_i ← updateSSSPVD(s_i, β);
 4      replacePath(s_i, t_i);              /* update of BC scores */
 5  end
 6  foreach C_i ∈ R' do
 7      ṼD_i ← updateApprVD(C_i, β);
 8  end
 9  foreach unvisited C_j do
10      add C_j to R';
11      ṼD_j ← initApprVD(C_j);
12  end
13  ṼD ← max_{C_i ∈ R ∪ R'} ṼD_i;
14  r_new ← (c/ε²)(⌊log₂(ṼD − 2)⌋ + 1 + ln(1/δ));
15  if r_new > r then
16      sampleNewPaths();                   /* update of BC scores */
17      foreach v ∈ V do
18          c̃_B(v) ← c̃_B(v) · r/r_new;      /* renormalization of BC scores */
19      end
20      r ← r_new;
21  end
22  return {(v, c̃_B(v)) : v ∈ V}

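
The renormalization step of Algorithm 1 can be illustrated with a short Python sketch. It rests on an assumption about the order of operations suggested by the listing: the newly sampled paths are first credited with the old weight $1/r$, and all scores are then multiplied by $r/r_{new}$, so that every sample ends up contributing $1/r_{new}$. The function name and the `scores` dictionary are hypothetical.

```python
def add_samples_and_renormalize(scores, r_old, new_paths):
    """Add newly sampled paths, then rescale all scores by r_old / r_new.

    Assumption: only the inner nodes of a path are credited, and new paths
    are credited with 1/r_old before the rescaling.
    """
    r_new = r_old + len(new_paths)
    for path in new_paths:
        for v in path[1:-1]:
            scores[v] = scores.get(v, 0.0) + 1.0 / r_old
    for v in scores:
        scores[v] *= r_old / r_new
    return r_new
```

For example, a node that was an inner node of one out of two old samples and of one out of two new samples keeps the score 2/4 = 0.5 after the update.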


Theorem 3. Let $\Delta r$ be the difference between the value of $r$ before and
after the batch and let $\|A^{(i)}\|$ be the sum of affected nodes and their
incident edges in the $i$-th SSSP. The time required for the BC update in
unweighted graphs is
$O((r + r')|\beta| + \sum_{i=1}^{r+r'} (\|A^{(i)}\| + d^{(i)}_{max}) + \Delta r(|V| + |E|))$.
In weighted graphs, it is
$O((r + r')|\beta| \log |\beta| + \sum_{i=1}^{r+r'} \|A^{(i)}\| \log \|A^{(i)}\| + \Delta r(|V| \log |V| + |E|))$.

Notice that, if VD does not increase, $\Delta r = 0$ and the complexities are the
same as those of the only-incremental algorithms IA and IAW we proposed in [3].
Also, notice that in the worst case the complexity can be as bad as recomputing
from scratch. However, no dynamic SSSP (and so probably also no BC approximation)
algorithm exists that is faster than recomputation.

5 Experiments 

Implementation and settings. We implement our two dynamic approaches DA
and DAW in C++, building on the open-source NetworKit framework [20], which
also contains the static approximation RK. In all experiments we fix $\delta$ to
0.1 and $\epsilon$ to 0.05, as a good tradeoff between running time and accuracy
[3]. This means that, with a probability of at least 90%, the computed BC values
deviate at most 0.05 from the exact ones. In our previous experimental study [3],
we showed that









Graph             Type           Nodes       Edges        Weights
repliesDigg       communication     30,398       85,155   Weighted
emailSlashdot     communication     51,083      116,573   Weighted
emailLinux        communication     63,399      159,996   Weighted
facebookPosts     communication     46,952      183,412   Weighted
emailEnron        communication     87,273      297,456   Weighted
facebookFriends   friendship        63,731      817,035   Unweighted
arXivCitations    coauthorship      28,093    3,148,447   Unweighted
englishWikipedia  hyperlink      1,870,709   36,532,531   Unweighted

Table 1: Overview of real dynamic graphs used in the experiments.


for such values of $\epsilon$ and $\delta$, the ranking error (how much the
ranking computed by the approximation algorithm differs from the ranking of the
exact algorithm) is low for nodes with high betweenness. Since our algorithms
simply update the approximation of RK, our accuracy in terms of ranking error
does not differ from that of RK (see [?] for details). Also, our experiments in
[3] have shown that dynamic exact algorithms are not scalable, because of both
time and memory requirements; therefore we do not include them in our tests. The
machine used has 2 x 8 Intel(R) Xeon(R) E5-2680 cores at 2.7 GHz, of which we use
only one core, and 256 GB RAM.

Data sets and experiments. We concentrate on two types of graphs: synthetic and
real-world graphs with real edge dynamics. The real-world networks are taken
from The Koblenz Network Collection (KONECT) [13] and are summarized in
Table 1. All the edges of the KONECT graphs are characterized by a time of
arrival. In case of multiple edges between two nodes, we extract two versions of
the graph: one unweighted, where we ignore additional edges, and one weighted,
where we replace the set $E_{st}$ of edges between two nodes with an edge of
weight $1/|E_{st}|$. In our experiments, we let the batch size vary from 1 to
1024 and for each batch size, we average the running times over 10 runs. Since
the networks do not include edge deletions, we implement additional simulated
dynamics. In particular, we consider the following experiments. (i) Real
dynamics. We remove the $x$ edges with the highest timestamp from the network and
we insert them back in batches, in the order of timestamps. (ii) Random
insertions and deletions. We remove $x$ edges from the graph, chosen uniformly at
random. To create batches of both edge insertions and deletions, we add back the
deleted edges with probability 1/2 and delete other random edges with
probability 1/2. (iii) Random weight changes. In weighted networks, we choose $x$
edges uniformly at random and we multiply their weight by a random value in the
interval (0, 2).
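
The extraction of the two graph versions from a multigraph can be sketched as follows. This is a small Python illustration under assumptions: the input is a list of undirected edges given as node pairs with comparable labels, timestamps are ignored, and the function name is hypothetical.

```python
from collections import Counter


def aggregate_multi_edges(edge_list):
    """Return (unweighted edge set, weighted edges with weight 1/|E_st|)."""
    # Undirected graph: (s, t) and (t, s) denote the same edge.
    counts = Counter(tuple(sorted(edge)) for edge in edge_list)
    unweighted = set(counts)
    weighted = {edge: 1.0 / k for edge, k in counts.items()}
    return unweighted, weighted
```

With this choice, node pairs that interact more often are connected by a shorter edge, so frequently used connections are preferred by shortest paths.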

For synthetic graphs we use a generator based on a unit-disk graph model
in hyperbolic geometry [?], where edge insertions and deletions are obtained
by moving the nodes in the hyperbolic plane. The networks produced by the
model were shown to have many properties of real complex networks, like small
diameter and power-law degree distribution (see [?] and the references therein).
We generate seven networks, with $|E|$ ranging from about $2 \cdot 10^4$ to about
$2 \cdot 10^7$ and $|V|$ approximately equal to $|E|/10$.

Fig. 1: Speedups of DA on RK in real unweighted networks using real dynamics.
Legend: repliesDigg, emailSlashdot, emailLinux, facebookPosts, emailEnron,
facebookFriends, arXivCitations, englishWikipedia.



                  Real                                   Random
                  Time [s]          Speedups             Time [s]          Speedups
Graph             |β|=1   |β|=1024  |β|=1     |β|=1024   |β|=1   |β|=1024  |β|=1     |β|=1024
repliesDigg       0.078   1.028       76.11     5.42     0.008   0.832       94.00     4.76
emailSlashdot     0.043   1.055      219.02     9.91     0.038   1.151      263.89    28.81
emailLinux        0.049   1.412      108.28     3.59     0.051   2.144       72.73     1.33
facebookPosts     0.023   1.416      527.04     9.86     0.015   1.520      745.86     8.21
emailEnron        0.368   1.279       83.59    13.66     0.203   1.640       99.45     9.39
facebookFriends   0.447   1.946       94.23    18.70     0.448   2.184       95.91    18.24
arXivCitations    0.038   0.186     2287.84   400.45     0.025   1.520     2188.70    28.81
englishWikipedia  1.078   6.735     3226.11   617.47     0.877   5.937     2833.57   703.18

Table 2: Times and speedups of DA on RK in unweighted real graphs under real
dynamics and random updates, for batch sizes of 1 and 1024.


Speedups. Figure 1 reports the speedups of DA on RK in real graphs using real
dynamics. Although some fluctuations can be noticed, the speedups tend to
decrease as the batch size increases. We can attribute fluctuations to two main
factors: First, different batches can affect areas of $G$ of varying sizes,
influencing also the time required to update the SSSPs. Second, changes in the VD
approximation can require sampling new paths and therefore increase the running
time of DA (and DAW). Nevertheless, DA is significantly faster than
recomputation on all networks and for every tested batch size. Analogous results
are reported in the Appendix for random dynamics. Table 2 summarizes
the running times of DA and its speedups on RK with batches of size 1 and
1024 in unweighted graphs, under both real and random dynamics. Even on the
larger graphs (arXivCitations and englishWikipedia) and on large batches,
DA requires at most a few seconds to recompute the BC scores, whereas RK
requires about one hour for englishWikipedia. The results on weighted graphs
are shown in the Appendix. In both real dynamics and random updates, the
speedups vary between $\approx 50$ and $\approx 6 \cdot 10^{?}$ for single-edge
updates and between $\approx 5$ and $\approx 75$ for batches of size 1024. On
hyperbolic graphs (Figure 2), the speedups of DA on RK increase with the size of
the graph. The Appendix contains the running times and speedups on batches of
1 and 1024 edges. The speedups vary between $\approx 100$ and
$\approx 3 \cdot 10^{?}$ for single-edge updates and between $\approx 3$ and
$\approx 5 \cdot 10^{?}$ for batches of 1024 edges. The results
show that DA and DAW are faster than recomputation with RK in all the tested

Fig. 2: Speedups of DA on RK in hyperbolic unit-disk graphs.
Legend: m = 20000, m = 50000, m = 200000, m = 500000, m = 2000000,
m = 5000000, m = 20000000.


instances, even when large batches of 1024 edges are applied to the graph. With
small batches, the algorithms are always orders of magnitude faster than RK,
often with running times of fractions of seconds or seconds compared to minutes
or hours. Such high speedups are made possible by the efficient update of the
sampled shortest paths, which limits the recomputation to the nodes that are
actually affected by the batch. Also, by processing the edges in batches, we avoid
updating nodes multiple times when they are affected by several edges of the
batch.


6 Conclusions 


Betweenness is a widely used centrality measure, yet expensive if computed
exactly. In this paper we have presented the first fully-dynamic algorithms for
betweenness approximation (for weighted and for unweighted undirected graphs).
The consideration of edge deletions and disconnected graphs is made possible by
the efficient solution of several algorithmic subproblems (some of which may be
of independent interest). Now BC can be approximated with an error guarantee
for a much wider set of dynamic real graphs compared to previous work.

Our experiments show significant speedups over the static algorithm RK. In
this context it is interesting to remark that dynamic algorithms need to store
additional data and that this can be a limit to the size of the graphs they can
be applied to. By not storing the predecessors in the shortest paths, we reduce
the memory requirement from $O(|E|)$ per sampled path to $O(|V|)$, and are still
often more than 100 times faster than RK despite rebuilding the paths.

Future work may include the transfer of our concepts to approximating other
centrality measures in a fully-dynamic manner, e.g. closeness, and the extension
to directed graphs, for which a good VD approximation is the only obstacle.
Moreover, making the betweenness code run in parallel will further accelerate
the computations in practice. Our implementation will be made available as part
of a future release of the network analysis tool suite NetworKit [20].

A Description of the fully-dynamic algorithms 

A.l Dynamic VD approximation 

Algorithm 2 describes the initialization. Initially, we put all the nodes in a queue
and compute an SSSP from the nodes we extract. During the SSSP search, we
mark as visited all the nodes we scan. When extracting the nodes, we skip those
that have already been visited: this prevents us from computing multiple
approximations for the same component. In the update (Algorithm 3), we recompute the
SSSPs and the VD approximations with updateApprVD-W (or updateApprVD-U).
Since components might split, we might need to add VD approximations
for some new subcomponents, in addition to recomputing the old ones. Also, if
components merge, we can discard the superfluous approximations. To do this,
we keep track, for each node, of the number of times it has been visited. Let
$vis(v)$ denote this number for node $v$. Before the update, all the nodes are
visited exactly once. While updating an SSSP from $s_i$, we increase (decrease) by
one $vis(v)$ of the nodes $v$ that become reachable (unreachable) from $s_i$. This way
we can skip the update of the SSSPs from nodes that have already been visited.
After the update, for all nodes $v$ that have become unvisited ($vis(v) = 0$), we
compute a new VD approximation from scratch.


Algorithm 2: Dynamic VD approximation (initialization)

 1 U ← [];
 2 foreach node v ∈ V do
 3     vis(v) ← 0; insert v into U;
 4 end
 5 i ← 1;
 6 while U ≠ ∅ do
 7     extract s from U;
 8     if vis(s) = 0 then
 9         s_i ← s;
           // initApprVD adds 1 to vis(v) of the nodes v it visits
10         VD_i ← initApprVD(G, s_i);
11         i ← i + 1;
12     end
13 end
14 n_c ← i − 1;
15 VD ← max_{i=1,...,n_c} VD_i;
16 return VD
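
To illustrate the initialization, the following Python sketch covers every connected component with exactly one source node and maintains the counters vis. It is only a minimal sketch for unweighted graphs, not the implementation used in the experiments: the names `init_appr_vd` and `init_dynamic_vd`, the adjacency-dictionary input and the choice of estimate (1 plus the sum of the two largest BFS distances from the source, which bounds the vertex diameter of the component from above) are assumptions made for this example.

```python
from collections import deque


def init_appr_vd(adj, s, vis):
    """BFS from s: increments vis for every node reached and returns a
    VD estimate for the component of s.

    Assumption of this sketch: the estimate is 1 plus the sum of the two
    largest BFS distances from s.
    """
    dist = {s: 0}
    queue = deque([s])
    vis[s] += 1
    while queue:
        v = queue.popleft()
        for z in adj[v]:
            if z not in dist:
                dist[z] = dist[v] + 1
                vis[z] += 1
                queue.append(z)
    far = sorted(dist.values(), reverse=True)[:2]  # two largest distances
    return 1 + sum(far)


def init_dynamic_vd(adj):
    """One source and one VD estimate per connected component."""
    vis = {v: 0 for v in adj}
    sources, estimates = [], []
    for s in adj:                 # iterating over adj plays the role of the queue U
        if vis[s] == 0:           # nodes of covered components are skipped
            sources.append(s)
            estimates.append(init_appr_vd(adj, s, vis))
    return max(estimates), sources, vis


# Three components: a path 0-1-2, an edge 3-4 and the isolated node 5.
graph = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3}, 5: set()}
vd, sources, vis = init_dynamic_vd(graph)
```

After the call, every node has been visited exactly once, which is the invariant the update step relies on.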


Algorithm 3: Dynamic VD approximation (updateApprVD)

 1 U ← [];
 2 foreach s_i do
 3     if vis(s_i) > 1 then
 4         remove s_i and VD_i; decrease n_c;
 5     end
 6     else
           // updateApprVD updates vis, inserts all v for which vis(v) = 0
           // into U and computes a VD approximation VD_i
 7         VD_i ← updateApprVD(G, s_i);
 8     end
 9 end
10 i ← n_c;
11 while U ≠ ∅ do
12     extract s' from U;
13     if vis(s') = 0 then
14         s'_i ← s';
15         VD_i ← initApprVD(G, s'_i);
16         i ← i + 1; n_c ← n_c + 1;
17     end
18 end
19 reset vis(v) to 1 for nodes v such that vis(v) > 1;
20 VD ← max_{i=1,...,n_c} VD_i;
21 return VD


A.2 Dynamic SSSP update for weighted graphs

Algorithm 4 describes the SSSP update for weighted graphs. The pseudocode
updates both the VD approximation for the connected component of $s$ and the
number of shortest paths from $s$, so it basically includes both updateSSSP-W
and updateApprVD-W. Initially, we scan the edges $e = \{u, v\}$ in $\beta$ and, for each
$e$, we insert the endpoint with greater distance from $s$ into $Q$ (w.l.o.g., let $v$
be such endpoint). The priority $p(v)$ of $v$ represents the candidate new distance
of $v$. This is the minimum between $d(v)$ and $d(u)$ plus the weight of the
edge $\{u, v\}$. Notice that we use the expression "insert $v$ into $Q$" for simplicity,
but this can also mean update $p(v)$ if $v$ is already in $Q$ and the new priority is
smaller than $p(v)$. When we extract a node $w$ from $Q$, we have two possibilities:
(i) there is a path of length $p(w)$ and $p(w)$ is actually the new distance or (ii)
there is no path of length $p(w)$ and the new distance is greater than $p(w)$. In
the first case (Lines 9-23), we set $d(w)$ to $p(w)$ and insert the neighbors $z$ of
$w$ such that $d(z) > d(w) + \omega(\{w, z\})$ into $Q$ (to check if new shorter paths to
$z$ that go through $w$ exist). In the second case (Lines 24-40), we assume there
is no shortest path between $s$ and $w$ anymore, setting $d(w)$ to $\infty$. We compute
$p(w)$ as $\min_{\{v, w\} \in E} d(v) + \omega(v, w)$ (the new candidate distance for $w$) and insert
$w$ into $Q$. Also its neighbors could have lost one (or all of) their old shortest
paths, so we insert them into $Q$ as well. The update of $\underline{\omega}$ can be done while
scanning the batch and of $d'$ and $d''$ when we update $d(w)$. When updating
$d(w)$, we also increase $vis(w)$ in case the old $d(w)$ was equal to $\infty$ (i.e. $w$ has
become reachable) and we decrease $vis(w)$ when we set $d(w)$ to $\infty$ (i.e. $w$ has
become unreachable). We update the number of shortest paths after updating
$d(w)$, as the sum of the shortest paths of the predecessors of $w$ (Lines 16-18).


Algorithm 4: SSSP update for weighted graphs (updateSSSP-W)

 1 Q ← empty priority queue;
 2 foreach e = {u, v} ∈ β, d(u) < d(v) do
 3     Q ← insertOrDecreaseKey(v, p(v) = min{d(u) + ω({u, v}), d(v)});
 4 end
 5 ω̲ ← min{ω̲, ω(e) : e ∈ β};
 6 while there are nodes in Q do
 7     {w, p(w)} ← extractMin(Q);
 8     con(w) ← min_{z:(z,w)∈E} d(z) + ω(z, w);
 9     if con(w) = p(w) then
10         update d' and d'';
11         if d(w) = ∞ then
12             vis(w) ← vis(w) + 1;
13         end
14         d(w) ← p(w); σ(w) ← 0;
15         foreach incident edge (z, w) do
16             if d(w) = d(z) + ω(z, w) then
17                 σ(w) ← σ(w) + σ(z);
18             end
19             if d(z) > d(w) + ω(z, w) then
20                 Q ← insertOrDecreaseKey(z, p(z) = d(w) + ω(z, w));
21             end
22         end
23     end
24     else
25         if d(w) ≠ ∞ then
26             vis(w) ← vis(w) − 1;
27             if vis(w) = 0 then
28                 insert w into U;
29             end
30             if con(w) ≠ ∞ then
31                 Q ← insertOrDecreaseKey(w, p(w) = con(w));
32                 foreach incident edge (z, w) do
33                     if d(z) = d(w) + ω(w, z) then
34                         Q ← insertOrDecreaseKey(z, p(z) = d(w) + ω(z, w));
35                     end
36                 end
37             end
38             d(w) ← ∞;
39         end
40     end
41 end
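
The distance part of this update can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' code: it maintains only the distances (the bookkeeping of the numbers of shortest paths, of vis and of $d'$, $d''$ is omitted), it expects the graph as a dictionary of dictionaries that already reflects the batch, it assumes positive integer weights so that the test `con == p` is exact, and it emulates insertOrDecreaseKey with Python's `heapq` plus lazy deletion of outdated entries. The names `update_sssp_weighted`, `dijkstra` and `insert_or_decrease` are hypothetical.

```python
import heapq
import math

INF = math.inf


def dijkstra(adj, s):
    """Static reference: distances from s computed from scratch."""
    dist = {v: INF for v in adj}
    dist[s] = 0
    heap = [(0, s)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue
        for z, wt in adj[v].items():
            if d + wt < dist[z]:
                dist[z] = d + wt
                heapq.heappush(heap, (d + wt, z))
    return dist


def update_sssp_weighted(adj, dist, batch):
    """adj: graph after the batch; dist: old distances from the source
    (updated in place); batch: list of inserted or deleted edges (u, v)."""
    key, heap = {}, []

    def insert_or_decrease(v, k):
        if k < key.get(v, INF):
            key[v] = k
            heapq.heappush(heap, (k, v))

    for u, v in batch:
        if dist[u] > dist[v]:
            u, v = v, u
        if dist[u] < dist[v]:
            wt = adj[u].get(v, INF)          # INF if the edge was deleted
            insert_or_decrease(v, min(dist[u] + wt, dist[v]))

    while heap:
        p, w = heapq.heappop(heap)
        if key.get(w) != p:                  # outdated heap entry
            continue
        del key[w]
        con = min((dist[z] + wt for z, wt in adj[w].items()), default=INF)
        if con == p:                         # case (i): p is the new distance
            dist[w] = p
            for z, wt in adj[w].items():
                if dist[z] > p + wt:
                    insert_or_decrease(z, p + wt)
        elif dist[w] != INF:                 # case (ii): old distance is invalid
            if con != INF:
                insert_or_decrease(w, con)
                for z, wt in adj[w].items():
                    if dist[z] == dist[w] + wt:
                        insert_or_decrease(z, dist[z])
            dist[w] = INF
    return dist


# Old graph: 0-1 (1), 1-2 (1), 0-3 (4), 3-2 (1), 2-4 (1).
# Batch: delete {1, 2}, insert {1, 4} with weight 3.
new_graph = {0: {1: 1, 3: 4}, 1: {0: 1, 4: 3}, 2: {3: 1, 4: 1},
             3: {0: 4, 2: 1}, 4: {2: 1, 1: 3}}
old_dist = {0: 0, 1: 1, 2: 2, 3: 3, 4: 3}
new_dist = update_sssp_weighted(new_graph, dict(old_dist), [(1, 2), (1, 4)])
```

In the example, nodes 2, 3 and 4 are first invalidated and then receive their new distances, while node 1 is never touched.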
Algorithm 5: SSSP update for unweighted graphs (updateSSSP-U)

 1 Assumption: color(w) = white ∀w ∈ V;
 2 Q[] ← array of empty queues;
 3 foreach e = {u, v} ∈ β, d(u) < d(v) do
 4     k ← d(u) + 1; enqueue v → Q[k];
 5 end
 6 k ← 1;
 7 while there are nodes in Q[j], j ≥ k do
 8     while Q[k] ≠ ∅ do
 9         dequeue w ← Q[k];
10         if color(w) = black then continue;
11         con(w) ← min_{z:(z,w)∈E} d(z) + 1;
12         if con(w) = k then
13             update d' and d'';
14             if d(w) = ∞ then vis(w) ← vis(w) + 1;
15             d(w) ← k; σ(w) ← 0; color(w) ← black;
16             foreach incident edge (z, w) do
17                 if d(w) = d(z) + 1 then
18                     σ(w) ← σ(w) + σ(z);
19                 end
20                 if d(z) > k then
21                     enqueue z → Q[k + 1];
22                 end
23             end
24         end
25         else
26             if d(w) ≠ ∞ then
27                 d(w) ← ∞;
28                 vis(w) ← vis(w) − 1;
29                 if vis(w) = 0 then
30                     insert w into U;
31                 end
32                 if con(w) ≠ ∞ then
33                     enqueue w → Q[con(w)];
34                     foreach incident edge (z, w) do
35                         if d(z) > k then
36                             enqueue z → Q[k + 1];
37                         end
38                     end
39                 end
40             end
41         end
42     end
43     k ← k + 1;
44 end
45 Set to white all the nodes that have been in Q;




A.3 Dynamic SSSP update for unweighted graphs 

Algorithm 5 shows the pseudocode. As in Algorithm 4, we first scan the batch
(Lines 3-5) and insert the nodes in the queues. Then (Lines 7-44), we scan
the queues in order of increasing distance from $s$, in a fashion similar to that
of a priority queue. In order not to scan a node again once its final distance is
known, we use colors: initially we set all the nodes to white and then we set a node $w$
to black only when we find the final distance of $w$ (i.e. when we set $d(w)$ to $k$)
(Line 15). Black nodes extracted from a queue are then skipped (Line 10). At
the end we reset all nodes to white.
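
The queue-per-level scheme can be sketched in Python as follows. Again this is a minimal sketch under assumptions, restricted to the distances: the array of queues is replaced by a dictionary that maps a level to a deque (so empty levels are skipped by taking the smallest key), the colors are replaced by a local set of black nodes, and the names `update_sssp_unweighted` and `bfs` are hypothetical.

```python
from collections import deque
import math

INF = math.inf


def bfs(adj, s):
    """Static reference: BFS distances from s computed from scratch."""
    dist = {v: INF for v in adj}
    dist[s] = 0
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for z in adj[v]:
            if dist[z] == INF:
                dist[z] = dist[v] + 1
                queue.append(z)
    return dist


def update_sssp_unweighted(adj, dist, batch):
    """adj: graph after the batch; dist: old distances from the source
    (updated in place); batch: list of inserted or deleted edges (u, v)."""
    queues = {}        # level k -> deque of nodes, stands for the array Q[]
    black = set()      # nodes whose final distance has been found

    def enqueue(v, k):
        queues.setdefault(k, deque()).append(v)

    for u, v in batch:
        if dist[u] > dist[v]:
            u, v = v, u
        if dist[u] < dist[v]:
            enqueue(v, dist[u] + 1)

    while queues:
        k = min(queues)                  # next non-empty level
        level = queues.pop(k)
        while level:
            w = level.popleft()
            if w in black:
                continue
            con = min((dist[z] + 1 for z in adj[w]), default=INF)
            if con == k:                 # k is the final distance of w
                dist[w] = k
                black.add(w)
                for z in adj[w]:
                    if dist[z] > k:
                        enqueue(z, k + 1)
            elif dist[w] != INF:         # the old distance of w is invalid
                dist[w] = INF
                if con != INF:
                    enqueue(w, con)
                    for z in adj[w]:
                        if dist[z] > k:
                            enqueue(z, k + 1)
    return dist


# Old graph: path 0-1-2-3-4. Batch: delete {1, 2}, insert {0, 3}.
new_graph = {0: {1, 3}, 1: {0}, 2: {3}, 3: {0, 2, 4}, 4: {3}}
old_dist = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
new_dist = update_sssp_unweighted(new_graph, dict(old_dist), [(1, 2), (0, 3)])
```

Discarding the local set `black` at the end of the function corresponds to resetting all nodes to white.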


A.4 Fully-dynamic BC approximation


Similarly to IA and IAW, we replace the $r$ sampled paths between vertex pairs
$(s, t)$ with new shortest paths between the same vertex pairs. However, here
we also check whether VD (and consequently the number $r$ of samples) has
increased after the batch of edge updates. If so, we sample additional paths
(computing new SSSPs from scratch) according to the new value of $r$. Instead
of updating VD and then the paths in two successive steps, we use the SSSPs
from the $r$ source nodes $s$ to compute and update also VD, computing new
SSSPs only for the components that are not covered by any of the source nodes.
In the initialization (Algorithm 6), we first compute the $r$ SSSPs, like in RK
(Lines 4-18). However, we also check which nodes have been visited, as in
Algorithm 2. While we compute the $r$ SSSPs, in addition to the distances and
number of shortest paths, we also compute a VD approximation for each of the
$r$ source nodes and increase $vis(v)$ of all the nodes we visit during the searches
with initSSSPVD (Line 8). Since it is possible that the $r$ shortest paths do not
cover all the components of $G$, we compute an additional VD approximation for
nodes in the unvisited components, like in Algorithm 2 (Lines 21-28). Basically
we can divide the SSSPs into two sets: the set $R$ of SSSPs used to compute
the $r$ shortest paths and the set $R'$ of SSSPs used for a VD approximation
in the components that were not scanned by the initial $R$ SSSPs. We call $r'$
the number of the SSSPs in $R'$. The BC update after a batch is described in
Algorithm 7. First (Lines 2-19), we recompute the shortest paths like in our
incremental algorithms IA and IAW [3]: we update the SSSPs from each source
node $s$ in $R$ and we replace the old shortest path with a new one (subtracting
$1/r$ from the nodes in the old shortest path and adding $1/r$ to those in the new
shortest path). Notice that here we do not store the predecessors so we need
to recompute them (Lines 11 and 17). Instead of using an incremental SSSP
algorithm like in IA and IAW, here we use the fully-dynamic updateSSSPVD that
also updates the VD approximation and keeps track of the nodes
that become unvisited. Then (in the loop ending at Line 31), we add a new SSSP to $R'$ for each
component that has become unvisited (by both $R$ and $R'$). After this, we have
at least a VD approximation for each component of $G$. We take the maximum
over all these approximations and recompute the number of samples $r$ (Lines
32-33). If $r$ has increased, we need to sample new paths and therefore new SSSPs
to add to $R$. Finally, we normalize the BC scores, i.e. we multiply them by the
old value of $r$ divided by the new value of $r$ (Line 37).


Algorithm 6: BC initialization

 1 foreach node v ∈ V do
 2     c_B(v) ← 0; vis(v) ← 0;
 3 end
 4 VD ← getApproxVertexDiameter(G);
 5 r ← (c/ε²)(⌊log₂(VD − 2)⌋ + ln(1/δ));
 6 for i ← 1 to r do
 7     (s_i, t_i) ← sampleUniformNodePair(V);
 8     VD_i ← initSSSPVD(G, s_i);
 9     v ← t_i;
10     p_(i) ← empty list;
11     P_{s_i}(v) ← {z : {z, v} ∈ E ∧ d_{s_i}(v) = d_{s_i}(z) + ω({z, v})};
12     while P_{s_i}(v) ≠ {s_i} do
13         sample z ∈ P_{s_i}(v) with probability σ_{s_i}(z)/σ_{s_i}(v);
14         c_B(z) ← c_B(z) + 1/r;
15         add z → p_(i); v ← z;
16         P_{s_i}(v) ← {z : {z, v} ∈ E ∧ d_{s_i}(v) = d_{s_i}(z) + ω({z, v})};
17     end
18 end
19 U ← V;
20 i ← r + 1;
21 while U ≠ ∅ do
22     extract s' from U;
23     if vis(s') = 0 then
24         s'_i ← s';
25         VD_i ← initApprVD(G, s'_i);
26         i ← i + 1;
27     end
28 end
29 r' ← i − r − 1;
30 return {(v, c_B(v)) : v ∈ V}
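
The backward sampling of a shortest path without stored predecessors (Lines 9-17) can be sketched in Python as follows. This is a minimal sketch for unweighted graphs under stated assumptions: distances and numbers of shortest paths come from a plain BFS, the predecessor set is rebuilt from the distances at every step, and the names `bfs_counts` and `sample_path` are hypothetical.

```python
import random
from collections import deque


def bfs_counts(adj, s):
    """Distances and numbers of shortest paths (sigma) from s."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for z in adj[v]:
            if z not in dist:
                dist[z] = dist[v] + 1
                sigma[z] = 0
                queue.append(z)
            if dist[z] == dist[v] + 1:   # v is a predecessor of z
                sigma[z] += sigma[v]
    return dist, sigma


def sample_path(adj, dist, sigma, s, t, rng=random):
    """Inner nodes of a random shortest s-t path, listed from t back to s."""
    path = []
    if t not in dist:                    # t unreachable from s: nothing to sample
        return path
    v = t
    while True:
        # predecessors are recomputed from the distances, not stored
        preds = [z for z in adj[v] if z in dist and dist[z] + 1 == dist[v]]
        if not preds or preds == [s]:
            break
        # z is chosen with probability sigma(z) / sigma(v)
        z = rng.choices(preds, weights=[sigma[p] for p in preds])[0]
        path.append(z)
        v = z
    return path


# On the path 0-1-2-3 the only shortest path from 0 to 3 has inner nodes 2 and 1.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
d, sig = bfs_counts(line, 0)
inner = sample_path(line, d, sig, 0, 3)
```

Each node of the returned list would then receive $1/r$, as in Line 14. Because the weights of the predecessors sum to $\sigma(v)$, every shortest path between the pair is drawn with the same probability.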
Algorithm 7: Dynamic update of BC approximation (DA)

 1 U ← [];
 2 for i ← 1 to r do
 3     d_i^old ← d_{s_i}(t_i);
 4
       // updateSSSPVD updates vis, inserts all v : vis(v) = 0 into U and
       // updates the VD approximation
 5     VD_i ← updateSSSPVD(G, s_i, β);
       // we replace the shortest path between s_i and t_i
 6     foreach w ∈ p_(i) do
 7         c_B(w) ← c_B(w) − 1/r;
 8     end
 9     v ← t_i;
10     p_(i) ← empty list;
11     P_{s_i}(v) ← {z : {z, v} ∈ E ∧ d_{s_i}(v) = d_{s_i}(z) + ω({z, v})};
12     while P_{s_i}(v) ≠ {s_i} do
13         sample z ∈ P_{s_i}(v) with probability σ_{s_i}(z)/σ_{s_i}(v);
14         c_B(z) ← c_B(z) + 1/r;
15         add z to p_(i);
16         v ← z;
17         P_{s_i}(v) ← {z : {z, v} ∈ E ∧ d_{s_i}(v) = d_{s_i}(z) + ω({z, v})};
18     end
19 end
20 for i ← r + 1 to r + r' do
21     VD_i ← updateApprVD(G, s_i, β);
22 end
23 i ← r + r' + 1;
24 while U ≠ ∅ do
25     extract s' from U;
26     if vis(s') = 0 then
27         s'_i ← s';
28         VD_i ← initApprVD(G, s'_i);
29         i ← i + 1; r' ← r' + 1;
30     end
31 end
   // compute the maximum over all the VD_i computed by updateApprVD
32 VD ← max_{i=1,...,r+r'} VD_i;
33 r_new ← (c/ε²)(⌊log₂(VD − 2)⌋ + ln(1/δ));
34 if r_new > r then
35     sample new paths;
36     foreach v ∈ V do
37         c_B(v) ← c_B(v) · r/r_new;
38     end
39     r ← r_new;
40 end
41 return {(v, c_B(v)) : v ∈ V}
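
The two arithmetic steps of the update, the recomputation of the number of samples (Line 33) and the rescaling of the scores (Line 37), can be sketched in Python as follows. This is a minimal sketch under assumptions: the formula is taken as printed in the pseudocode, the result is rounded up to an integer, the default value `c=0.5` is only a placeholder for the constant $c$, and the names `num_samples` and `rescale` are hypothetical.

```python
import math


def num_samples(vd, eps, delta, c=0.5):
    """Number of samples r for a vertex diameter estimate vd > 2.

    Assumptions of this sketch: r is rounded up to an integer and
    c = 0.5 is a placeholder value for the constant c.
    """
    return math.ceil((c / eps ** 2)
                     * (math.floor(math.log2(vd - 2)) + math.log(1 / delta)))


def rescale(scores, r_old, r_new):
    """Multiply every score by r_old / r_new, so that a node lying on
    k of the sampled paths goes from k / r_old to k / r_new."""
    return {v: x * r_old / r_new for v, x in scores.items()}


r_old = num_samples(10, eps=0.1, delta=0.1)   # VD estimate before the batch
r_new = num_samples(18, eps=0.1, delta=0.1)   # larger VD estimate after the batch
scores = rescale({"a": 0.5, "b": 0.25}, 10, 20)
```

A larger VD estimate yields a larger number of samples, which is the only situation in which the scores are rescaled.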


B Omitted proofs 

B.1 Proof of Proposition

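Proof. To prove the first inequality, we can notice that $d^T(x, y) \geq d(x, y)$ for
all $x, y \in V$, since all the edges of $T$ are contained in those of $G$. Also, since
every edge has weight at least $\underline{\omega}$, $d(x, y) \geq (|p_{xy}| - 1) \cdot \underline{\omega}$. Therefore,
$d^T(x, y) \geq (|p_{xy}| - 1) \cdot \underline{\omega}$, which can be rewritten as
$|p_{xy}| \leq 1 + d^T(x, y)/\underline{\omega}$ for all $x, y \in V$.
Thus, $VD = \max_{x,y} |p_{xy}| \leq 1 + (\max_{x,y} d^T(x, y))/\underline{\omega} \leq 1 + (d^T(s, u) + d^T(s, v))/\underline{\omega} = 1 + (d(s, u) + d(s, v))/\underline{\omega}$,
where the last expression equals $\widetilde{VD}$ by definition.

To prove the second inequality, we first notice that $d(s, u) \leq (|p_{su}| - 1) \cdot \overline{\omega}$ and
analogously $d(s, v) \leq (|p_{sv}| - 1) \cdot \overline{\omega}$. Consequently,
$\widetilde{VD} \leq 1 + (|p_{su}| + |p_{sv}| - 2) \cdot \overline{\omega}/\underline{\omega} < 2 \cdot |p_{su}| \cdot \overline{\omega}/\underline{\omega}$,
supposing that $|p_{su}| \geq |p_{sv}|$ without loss of generality. By definition
of VD, $|p_{su}| \leq VD$. Therefore, $\widetilde{VD} < 2 \cdot VD \cdot \overline{\omega}/\underline{\omega}$. □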
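
As a numeric illustration of the two inequalities, the following Python sketch evaluates the bound on a small weighted path. It is only a sketch under an assumption read off the proof: the upper bound is taken to be 1 plus the sum of the two largest distances from the source divided by the minimum edge weight. The names `dijkstra` and `vd_upper_bound` are hypothetical.

```python
import heapq
import math


def dijkstra(adj, s):
    """Distances from s in a graph with positive edge weights."""
    dist = {v: math.inf for v in adj}
    dist[s] = 0
    heap = [(0, s)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue
        for z, wt in adj[v].items():
            if d + wt < dist[z]:
                dist[z] = d + wt
                heapq.heappush(heap, (d + wt, z))
    return dist


def vd_upper_bound(adj, s):
    """1 + (d' + d'') / w_min, with d', d'' the two largest finite
    distances from s (assumed form of the bound)."""
    dist = dijkstra(adj, s)
    reach = sorted((d for d in dist.values() if d < math.inf), reverse=True)
    d1, d2 = (reach + [0])[:2]
    w_min = min(wt for nbrs in adj.values() for wt in nbrs.values())
    return 1 + (d1 + d2) / w_min


# Path a-b-c-d with weights 1, 2, 3; its vertex diameter is 4.
path = {"a": {"b": 1}, "b": {"a": 1, "c": 2},
        "c": {"b": 2, "d": 3}, "d": {"c": 3}}
bound = vd_upper_bound(path, "b")
```

From the source b the two largest distances are 5 and 2, so the bound is 1 + 7/1 = 8, which lies between the true vertex diameter 4 and 2 · 4 · 3/1 = 24.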

B.2 Proof of Lemma 1

Proof. In the initial scan of the batch (Lines 2-4), we scan the edges of the batch
and insert the affected nodes into $Q$ (or update their value). This requires at
most one heap operation (insert or decrease-key) for each element of $\beta$, therefore
$O(|\beta| \log |\beta|)$ time. When we extract a node $w$ from $Q$, we have two possibilities:
(i) $con(w) = p(w)$ (Lines 9-23) or (ii) $con(w) > p(w)$ (Lines 24-40). In the first
case, we scan the neighbors of $w$ and perform at most one heap operation for
each of them (Lines 19-21). In the second case, this happens only if $d(w) \neq \infty$.
Therefore, we can perform up to one heap operation per incident edge of $w$, for
each extraction of $w$ in which $d(w) \neq \infty$ or $con(w) = p(w)$. How many times
can an affected node $w$ be extracted from $Q$ with $d(w) \neq \infty$ or $con(w) = p(w)$?
If the first time we extract $w$, $con(w)$ is equal to $p(w)$ (case (i)), then the final
value of $d(w)$ is reached and $w$ is not inserted into $Q$ anymore. If the first time
we extract $w$, $con(w)$ is greater than $p(w)$ (case (ii)), $w$ can be inserted into
the queue again. However, its distance is set to $\infty$ and therefore no additional
operations are performed, until $d(w)$ becomes less than $\infty$. But this can happen
only in case (i), after which $d(w)$ reaches its final value. To summarize, each
affected node $w$ can be extracted from $Q$ with $d(w) \neq \infty$ or $con(w) = p(w)$ at
most twice and, every time this happens, at most one heap operation per incident
edge of $w$ is performed. The complexity is therefore $O(|\beta| \log |\beta| + \|A\| \log \|A\|)$. □


B.3 Proof of Lemma 2

Proof. The complexity of the initialization (Lines 2-5) of Algorithm 5 is $O(|\beta|)$,
as we have to scan the batch. In the main loop (Lines 7-44), we scan the whole list
of queues, whose final size is $d_{\max}$. Every time we extract a node $w$ whose color
is not black, we scan all the incident edges, therefore this operation is linear
in the number of neighbors of $w$. If the first time we extract $w$ (say at level
$k$) $con(w)$ is equal to $k$, then $w$ will be set to black and will not be scanned
anymore. If the first time we extract $w$, $con(w)$ is instead greater than $k$, $w$
will be inserted into the queue at level $con(w)$ (if $con(w) < \infty$). Also, other
inconsistent neighbors of $w$ might insert $w$ in one of the queues. However, after
the first time $w$ is extracted, its distance is set to $\infty$, so its neighbors will not
be scanned unless $con(w) = k$, in which case they will be scanned again, but for
the last time, since $w$ will be set to black. To summarize, each affected node and
its neighbors can be scanned at most twice. The complexity of the algorithm is
therefore $O(|\beta| + \|A\| + d_{\max})$. □


B.4 Correctness of Algorithm 2 and Algorithm 3

Lemma 3. At the end of Algorithm 2, $vis(v) = 1$, $\forall v \in V$ and exactly one VD
approximation is computed for each connected component of $G$.


Proof. Let $v$ be any node. Then $v$ must be scanned by at least one source node $s_i$
in the while loop. In fact, either $v$ is visited by some $s_i$ before $v$ is
extracted from $U$, or $vis(v) = 0$ at the moment of the extraction and $v$ becomes
a source node itself. This implies that $vis(v) \geq 1$, $\forall v \in V$. On the other hand,
$vis(v)$ cannot be greater than 1. In fact, let us assume by contradiction that
$vis(v) > 1$. This means that there are at least two source nodes $s_i$ and $s_j$ ($i < j$,
w.l.o.g.) that are in the same connected component as $v$. Then also $s_i$ and $s_j$
are in the same connected component and $s_j$ is visited during the SSSP search
from $s_i$. Then $vis(s_j) = 1$ before $s_j$ is extracted from $U$ and $s_j$ cannot be a
source node. Therefore, $vis(v)$ is exactly equal to 1 for each $v \in V$, which means
that exactly one VD approximation is computed for the connected component
of each $v$, i.e. exactly one VD approximation is computed for each connected
component of $G$. □


Lemma 4. Let $C' = \{C'_1, \ldots, C'_{n'_c}\}$ be the set of connected components of $G$ after
the update. Algorithm 3 updates or computes exactly one VD approximation for
each $C'_i \in C'$.

Proof. Let $C = \{C_1, \ldots, C_{n_c}\}$ be the set of connected components before the
update. Let us consider three basic cases (then it is straightforward to see that the
proof holds also for combinations of these cases): (i) $C_i \in C$ is also a component
of $C'$, (ii) $C_i \in C$ and $C_j \in C$ merge into one component $C'_k$ of $C'$, (iii) $C_i \in C$
splits into two components $C'_j$ and $C'_k$ of $C'$. In case (i), the VD approximation
of $C_i$ is updated exactly once in the for loop (Lines 2-9). In case (ii), (assuming
$i < j$, w.l.o.g.) the VD approximation of $C'_k$ is updated in the for loop from the
source node $s_i \in C_i$. In its SSSP search, $s_i$ visits also $s_j \in C_j$, increasing $vis(s_j)$.
Therefore, $s_j$ is skipped and exactly one VD approximation is computed for $C'_k$.
In case (iii), the source node $s_i \in C_i$ belongs to one of the components (say $C'_j$)
after the update. During the for loop, the VD approximation is computed for $C'_j$
via $s_i$. Also, for all the nodes $v$ in $C'_k$, $vis(v)$ is set to 0 and $v$ is inserted into $U$.
Then some source node $s'_k \in C'_k$ must be extracted from $U$ in Line 12 and a VD
approximation is computed for $C'_k$. Since all the nodes in $C'_k$ are set to visited
during the search, no other VD approximations are computed for $C'_k$. □


B.5 Proof of Theorem 1

Proof. In the first part (Lines 2-9) of Algorithm 3, we update an SSSP with
updateApprVD-W or updateApprVD-U for each source node $s_i$ such that $vis(s_i)$ is
not greater than 1. Therefore the complexity of the first part is
$O(n_c \cdot |\beta| \log |\beta| + \sum_{i=1}^{n_c} \|A^{(i)}\| \log \|A^{(i)}\|)$ in weighted graphs and
$O(n_c \cdot |\beta| + \sum_{i=1}^{n_c} (\|A^{(i)}\| + d^{(i)}_{\max}))$ in unweighted graphs, by Lemmas 1 and 2.
Only some of the affected nodes (those
whose distance from a source node becomes equal to $\infty$) are inserted into the
queue $U$. Therefore the cost of scanning $U$ in Lines 11-18 is $O(\sum_{i=1}^{n_c} \|A^{(i)}\|)$.
New SSSP searches are computed for new components that are not covered by
the existing source nodes anymore. However, also such searches involve only
the affected nodes and each affected node (and its incident edges) is scanned
at most once during the search. Therefore, the total cost is
$O(n_c \cdot |\beta| \log |\beta| + \sum_{i=1}^{n_c} \|A^{(i)}\| \log \|A^{(i)}\|)$ for weighted graphs and
$O(n_c \cdot |\beta| + \sum_{i=1}^{n_c} (\|A^{(i)}\| + d^{(i)}_{\max}))$ for unweighted graphs. □


B.6 Proof of Theorem 2

Proof. Let $G$ be the old graph and $G'$ the modified graph after the batch of edge
updates. Let $p'_{xy}$ be a shortest path of $G'$ between nodes $x$ and $y$. To prove the
theoretical guarantee, we need to prove that the probability that any sampled path
$p'_{(i)}$ is equal to $p'_{xy}$ (i.e. that the algorithm adds $1/r'$ to the nodes in $p'_{xy}$) is
$\frac{1}{n(n-1)} \cdot \frac{1}{\sigma'_x(y)}$. Algorithm 7 replaces the first $r$ shortest paths with other shortest
paths $p'_{(1)}, \ldots, p'_{(r)}$ between the same node pairs (Lines 2-19) using Algorithm
4.1 of [3], for which it was already proven that
$\Pr(p'_{(i)} = p'_{xy}) = \frac{1}{n(n-1)} \cdot \frac{1}{\sigma'_x(y)}$ [3, Theorem 4.1].
The additional $\Delta r$ shortest paths (Line 35) are recomputed from
scratch with RK, therefore also in this case
$\Pr(p'_{(i)} = p'_{xy}) = \frac{1}{n(n-1)} \cdot \frac{1}{\sigma'_x(y)}$ by Lemma 7 of [18]. □

B.7 Proof of Theorem 3

Proof. Let $\Delta r'$ be the difference between the values of $r'$ before and after the
batch. Let us start from the simplest case: the graph $G$ is such that there is
(before and after the update) one sample in each component and VD does not
increase after the update. This case includes, for example, connected graphs
subject to a batch of only edge insertions, or any batch that neither splits the graph
into more components nor increases VD. In this case, $\Delta r = 0$ and $\Delta r' = 0$ and
we only need to update the $r$ old shortest paths. Then, the total complexity is
$O(r \cdot |\beta| + \sum_{i=1}^{r} (\|A^{(i)}\| + d^{(i)}_{\max}))$, where $A^{(i)}$ is the set of nodes affected in the $i$th
SSSP, and $d^{(i)}_{\max}$ is the maximum distance in the $i$th SSSP. In general graphs, we
might need to sample new paths for the betweenness approximation ($\Delta r > 0$)
and/or sample paths in new components that are not covered by any of the
sampled paths ($\Delta r' > 0$). Then, the complexity for the betweenness approximation
update is $O(r \cdot |\beta| + \sum_{i=1}^{r} (\|A^{(i)}\| + d^{(i)}_{\max})) + O(\Delta r (|V| + |E|))$. The VD update
requires $O(r' \cdot |\beta| + \sum_{i=1}^{r'} (\|A^{(i)}\| + d^{(i)}_{\max}))$ to update the VD approximation in
the already covered components and the new ones, where
$V_i$ and $E_i$ are nodes and edges of the $i$th component, respectively. □


C Additional Experimental Results 
Fig. 3: Speedups on RK in real unweighted graphs under random updates.



                              Real                                      Random
                 Time [s]            Speedups             Time [s]            Speedups
Graph            |β|=1    |β|=1024   |β|=1     |β|=1024   |β|=1    |β|=1024   |β|=1     |β|=1024
repliesDigg      0.053    3.032      605.18    14.24      0.049    3.046      658.19    14.17
emailSlashdot    0.790    5.387      50.81     16.12      0.716    5.866      56.00     14.81
emailLinux       0.324    24.816     5780.49   75.40      0.344    24.857     5454.10   75.28
facebookPosts    0.029    6.672      2863.83   11.42      0.029    6.534      2910.33   11.66
emailEnron       0.050    9.926      3486.99   24.91      0.046    50.425     3762.09   4.90

Table 3: Times and speedups of DAW on RK in weighted real graphs under real
dynamics and random updates, for batch sizes of 1 and 1024.



                              Hyperbolic
                   Time [s]            Speedups
Number of edges    |β|=1    |β|=1024   |β|=1       |β|=1024
m = 20000          0.005    0.195      99.83       2.79
m = 50000          0.002    0.152      611.17      10.21
m = 200000         0.015    0.288      422.81      22.64
m = 500000         0.012    0.339      1565.12     51.97
m = 2000000        0.049    0.498      2419.81     241.17
m = 5000000        0.083    0.660      4716.84     601.85
m = 20000000       0.006    0.401      304338.86   5296.78

Table 4: Times and speedups of DA on RK in hyperbolic unit-disk graphs, for
batch sizes of 1 and 1024.